optimization task
MURKA: Multi-Reward Reinforcement Learning with Knowledge Alignment for Optimization Tasks
Optimization plays a central role in Operations Research (OR) and numerous industrial applications, yet automating the end-to-end process of translating natural language descriptions into executable optimization programs remains a formidable challenge. While recent efforts have applied Large Language Models (LLMs) to this task, existing approaches are hindered by high inference costs, limited robustness across domains, and weak verification mechanisms. In this work, we propose MURKA, a reinforcement learning and knowledge distillationbased framework that enhances LLM-driven optimization modeling via collaborative agent alignment. MURKA orchestrates three specialized agents--Extractor, Solver, and Checker--to achieve accurate problem understanding, robust formulation, and verifiable execution. The Extractor is trained using group relative policy optimization with a composite reward function that incorporates semantic correctness and execution fidelity.
Interior Point Solving for LP-based prediction+optimisation
Solving optimization problems is the key to decision making in many real-life analytics applications. However, the coefficients of the optimization problems are often uncertain and dependent on external factors, such as future demand or energy or stock prices. Machine learning (ML) models, especially neural networks, are increasingly being used to estimate these coefficients in a data-driven way.